Smart Paper Notes

Attribute Inference Attacks in Online Multiplayer Video Games: a Case Study on Dota2

Pier Paolo Tricomi , Lisa Facciolo , Giovanni Apruzzese , Mauro Conti

Category: Machine Learning

2022-10-17

Did you know that over 70 million Dota2 players have their in-game data freely accessible? What if such data is used in malicious ways? This paper is the first to investigate this problem. Motivated by the widespread popularity of video games, we propose the first threat model for Attribute Inference Attacks (AIA) in the Dota2 context. We explain how (and why) attackers can exploit the abundant public data in the Dota2 ecosystem to infer private information about its players. Because concrete evidence on the efficacy of such AIA is lacking, we empirically demonstrate and assess their real-world impact. By conducting an extensive survey of $\sim$500 Dota2 players spanning over 26k matches, we verify whether a correlation exists between a player's Dota2 activity and their real life. Then, after finding such a link ($p\!<\!0.01$ and $\rho>0.3$), we ethically perform diverse AIA. We leverage the capabilities of machine learning to infer real-life attributes of our survey respondents from their publicly available in-game data. Our results show that, by applying domain expertise, some AIA can reach up to 98% precision and over 90% accuracy. This paper hence raises the alarm on a subtle but concrete threat that can potentially affect the entire competitive gaming landscape. We alerted the developers of Dota2.

translated by Google Translate

Captcha Attack: Turning Captchas Against Humanity

Mauro Conti , Luca Pajola , Pier Paolo Tricomi

Category: Computer Vision | Machine Learning

2022-01-11

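Nowadays, people generate and share massive amounts of content on online platforms (e.g., social networks, blogs). In 2021, the 1.9 billion daily active Facebook users posted around 150 thousand photos every minute. Content moderators constantly monitor these online platforms to prevent the spread of inappropriate content (e.g., hate speech, nude images). Building on advances in deep learning (DL), Automatic Content Moderators (ACM) help human moderators handle high data volumes. Despite their advantages, attackers can exploit weaknesses of DL components (e.g., preprocessing, model) to affect their performance. An attacker can therefore leverage such techniques to spread inappropriate content by evading ACM. In this work, we propose CAPtcha Attack (CAPA), an adversarial technique that allows users to spread inappropriate text online by evading ACM controls. By generating custom textual CAPTCHAs, CAPA exploits careless design implementations and internal procedural vulnerabilities of ACM. We test our attack on real-world ACM, and the results confirm the ferocity of our simple yet effective attack, which reaches up to 100% evasion success in most cases. At the same time, we demonstrate the difficulty of designing CAPA mitigations, opening new challenges in the CAPTCHA research area.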

Generalized Many-Body Dispersion Correction through Random-phase Approximation for Chemically Accurate Density Functional Theory

Pier Paolo Poier , Louis Lagardère , Jean-Philip Piquemal

Category: Machine Learning

2022-10-18

We extend our recently proposed Deep Learning-aided many-body dispersion (DNN-MBD) model to quadrupole polarizability (Q) terms using a generalized Random Phase Approximation (RPA) formalism, thus enabling the inclusion of van der Waals contributions beyond dipole. The resulting DNN-MBDQ model relies only on ab initio-derived quantities, as the introduced quadrupole polarizabilities are recursively retrieved from dipole ones, which are in turn modelled via the Tkatchenko-Scheffler method. A transferable and efficient deep neural network (DNN) provides atom-in-molecule volumes, while a single range-separation parameter is used to couple the model to Density Functional Theory (DFT). Since it can be computed at negligible cost, the DNN-MBDQ approach can be coupled with DFT functionals such as PBE, PBE0 and B86bPBE (dispersionless). The DNN-MBDQ-corrected functionals reach chemical accuracy while exhibiting lower errors than their DNN-MBD dipole-only counterparts and than other MBD-based dispersion correction models, with accuracy gains of up to 45%.

Multimodal Explainability via Latent Shift applied to COVID-19 stratification

Valerio Guarrasi , Lorenzo Tronchin , Domenico Albano , Eliodoro Faiella , Deborah Fazzini , Domiziana Santucci , Paolo Soda

Category: Artificial Intelligence | Machine Learning

2022-12-28

We are witnessing a widespread adoption of artificial intelligence in healthcare. However, most of the advancements in deep learning (DL) in this area consider only unimodal data, neglecting other modalities, even though their multimodal interpretation is necessary for supporting diagnosis, prognosis and treatment decisions. In this work we present a deep architecture, explainable by design, which jointly learns modality reconstructions and sample classifications using tabular and imaging data. The explanation of the decision taken is computed by applying a latent shift that simulates a counterfactual prediction, revealing the features of each modality that contribute the most to the decision and a quantitative score indicating the modality importance. We validate our approach in the context of the COVID-19 pandemic using the AIforCOVID dataset, which contains multimodal data for the early identification of patients at risk of severe outcome. The results show that the proposed method provides meaningful explanations without degrading the classification performance.

Simple Yet Surprisingly Effective Training Strategies for LSTMs in Sensor-Based Human Activity Recognition

Shuai Shao , Yu Guan , Xin Guan , Paolo Missier , Thomas Ploetz

Category: Machine Learning

2022-12-23

Human Activity Recognition (HAR) is one of the core research areas in mobile and wearable computing. With the application of deep learning (DL) techniques such as CNNs, recognizing periodic or static activities (e.g., walking, lying, cycling) has become a well-studied problem. What remains a major challenge, though, is the sporadic activity recognition (SAR) problem, where activities of interest tend to be non-periodic and occur less frequently than the often large amount of irrelevant background activities. Recent works suggested that sequential DL models (such as LSTMs) have great potential for modeling non-periodic behaviours, and in this paper we studied some LSTM training strategies for SAR. Specifically, we proposed two simple yet effective LSTM variants, namely the delay model and the inverse model, for two SAR scenarios (with and without a time-critical requirement). For time-critical SAR, the delay model can effectively exploit predefined delay intervals (within tolerance) in the form of contextual information for improved performance. For the regular SAR task, the second proposed variant, the inverse model, can learn patterns from the time series in an inverse manner, which can be complementary to the forward model (i.e., LSTM), and combining both can boost the performance. These two LSTM variants are very practical, and they can be regarded as training strategies that do not alter the LSTM fundamentals. We also studied some additional LSTM training strategies, which can further improve the accuracy. We evaluated our models on two SAR datasets and one non-SAR dataset, and the promising results demonstrated the effectiveness of our approaches in HAR applications.
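The abstract does not say how the delay and inverse variants are built, so the following is only a minimal data-preparation sketch of one possible reading: a "delayed" training pair lets the model see a few readings after the labelled time step, and an "inverse" pair presents the same readings in reversed time order. The function names and the `window` and `delay` parameters are hypothetical and are not taken from the paper.

```python
def delayed_pairs(signal, labels, window, delay):
    """Pair the label at time t with readings from t-window+1 through t+delay.

    Assumption: the 'delay' is a fixed number of future readings the model
    may wait for before committing to a prediction for time t.
    """
    pairs = []
    for t in range(window - 1, len(signal) - delay):
        context = signal[t - window + 1 : t + delay + 1]
        pairs.append((context, labels[t]))
    return pairs


def inverse_pairs(pairs):
    """Reverse the time axis of every input sequence; labels are unchanged."""
    return [(context[::-1], label) for context, label in pairs]


# Toy data: six sensor readings with one activity label per time step.
signal = [0, 1, 2, 3, 4, 5]
labels = ["a", "a", "b", "b", "c", "c"]

forward = delayed_pairs(signal, labels, window=3, delay=1)
backward = inverse_pairs(forward)
```

Under this reading, both variants leave the LSTM itself untouched and change only how training sequences are assembled, which matches the abstract's description of them as training strategies.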

When and Why Test Generators for Deep Learning Produce Invalid Inputs: an Empirical Study

Vincenzo Riccio , Paolo Tonella

Category: Machine Learning

2022-12-21

Testing Deep Learning (DL) based systems inherently requires large and representative test sets to evaluate whether DL systems generalise beyond their training datasets. Diverse Test Input Generators (TIGs) have been proposed to produce artificial inputs that expose issues of the DL systems by triggering misbehaviours. Unfortunately, such generated inputs may be invalid, i.e., not recognisable as part of the input domain, thus providing an unreliable quality assessment. Automated validators can ease the burden of manually checking the validity of inputs for human testers, although input validity is a concept that is difficult to formalise and, thus, to automate. In this paper, we investigate to what extent TIGs can generate valid inputs, according to both automated and human validators. We conduct a large empirical study, involving 2 different automated validators, 220 human assessors, 5 different TIGs and 3 classification tasks. Our results show that 84% of artificially generated inputs are valid according to automated validators, but their expected label is not always preserved. Automated validators reach a good consensus with humans (78% accuracy), but still have limitations when dealing with feature-rich datasets.

Uncertainty Quantification for Deep Neural Networks: An Empirical Comparison and Usage Guidelines

Michael Weiss , Paolo Tonella

Category: Machine Learning

2022-12-14

Deep Neural Networks (DNN) are increasingly used as components of larger software systems that need to process complex data, such as images, written texts, and audio/video signals. DNN predictions cannot be assumed to be always correct for several reasons, among them the huge input space involved, the ambiguity of some input data, and the intrinsic properties of learning algorithms, which can provide only statistical guarantees. Hence, developers have to cope with some residual error probability. An architectural pattern commonly adopted to manage failure-prone components is the supervisor, an additional component that can estimate the reliability of the predictions made by untrusted (e.g., DNN) components and can activate an automated healing procedure when these are likely to fail, ensuring that the Deep Learning based System (DLS) does not cause damage, despite its main functionality being suspended. In this paper, we consider DLS that implement a supervisor by means of uncertainty estimation. After overviewing the main approaches to uncertainty estimation and discussing their pros and cons, we motivate the need for a specific empirical assessment method that can deal with the experimental setting in which supervisors are used, where the accuracy of the DNN matters only as long as the supervisor lets the DLS continue to operate. Then we present a large empirical study conducted to compare the alternative approaches to uncertainty estimation. We distilled a set of guidelines for developers that are useful for incorporating a supervisor based on uncertainty monitoring into a DLS.
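As a minimal sketch of the supervisor pattern described above, the code below uses the highest softmax probability as the confidence estimate and rejects predictions that fall below a threshold. This choice of estimator, the threshold value, and all function names are assumptions made for illustration; the abstract does not say which uncertainty estimators the study compares. The sketch also reports accuracy only over accepted inputs, since that is the setting in which the abstract says DNN accuracy matters.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervise(logits, threshold=0.8):
    """Return (predicted_class, confidence, accepted).

    accepted is False when the top softmax probability is below the
    threshold, i.e. when the system should fall back to a healing
    procedure instead of trusting the DNN (hypothetical policy).
    """
    probs = softmax(logits)
    confidence = max(probs)
    return probs.index(confidence), confidence, confidence >= threshold


def supervised_accuracy(samples, threshold=0.8):
    """Accuracy over accepted inputs only, plus the acceptance rate."""
    accepted = correct = 0
    for logits, label in samples:
        predicted, _, ok = supervise(logits, threshold)
        if ok:
            accepted += 1
            correct += predicted == label
    accuracy = correct / accepted if accepted else None
    return accuracy, accepted / len(samples)


# Toy (logits, true_label) pairs; the second input is ambiguous.
samples = [
    ([4.0, 0.0, 0.0], 0),
    ([0.2, 0.1, 0.0], 1),
    ([0.0, 5.0, 0.0], 1),
    ([3.0, 0.0, 0.0], 1),
]
accuracy, acceptance_rate = supervised_accuracy(samples)
```

Raising the threshold trades a lower acceptance rate for fewer trusted mistakes, which is why the two numbers are returned together rather than accuracy alone.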

Towards a learning-based performance modeling for accelerating Deep Neural Networks

Damiano Perri , Paolo Sylos Labini , Osvaldo Gervasi , Sergio Tasso , Flavio Vella

Category: Machine Learning

2022-12-09

Emerging applications such as Deep Learning are often data-driven, so traditional approaches based on auto-tuners do not perform well across the wide range of inputs used in practice. In the present paper, we start an investigation of predictive models based on machine learning techniques in order to optimize Convolutional Neural Networks (CNNs). As a use case, we focus on the ARM Compute Library, which provides three different implementations of the convolution operator at different numeric precisions. Starting from a collation of benchmarks, we build and validate models learned by Decision Tree and naive Bayesian classifiers. Preliminary experiments on a Midgard-based ARM Mali GPU show that our predictive model outperforms all the convolution operators manually selected by the library.

Supervised Tractogram Filtering using Geometric Deep Learning

Pietro Astolfi , Ruben Verhagen , Laurent Petit , Emanuele Olivetti , Silvio Sarubbo , Jonathan Masci , Davide Boscaini , Paolo Avesani

Category: Computer Vision

2022-12-06

A tractogram is a virtual representation of the brain white matter. It is composed of millions of virtual fibers, encoded as 3D polylines, which approximate the white matter axonal pathways. To date, tractograms are the most accurate white matter representation and thus are used for tasks like presurgical planning and investigations of neuroplasticity, brain disorders, or brain networks. However, it is a well-known issue that a large portion of tractogram fibers is not anatomically plausible and can be considered artifacts of the tracking procedure. With Verifyber, we tackle the problem of filtering out such non-plausible fibers using a novel fully-supervised learning approach. Unlike other approaches based on signal reconstruction and/or brain topology regularization, we guide our method with the existing anatomical knowledge of the white matter. Using tractograms annotated according to anatomical principles, we train our model, Verifyber, to classify fibers as either anatomically plausible or non-plausible. The proposed Verifyber model is an original Geometric Deep Learning method that can deal with variable-size fibers while being invariant to fiber orientation. Our model considers each fiber as a graph of points, and by learning features of the edges between consecutive points via the proposed sequence Edge Convolution, it can capture the underlying anatomical properties. The resulting filtering is highly accurate and robust across an extensive set of experiments, and it is fast: with a 12GB GPU, filtering a tractogram of 1M fibers requires less than a minute. Verifyber implementation and trained models are available at https://github.com/FBK-NILab/verifyber.
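To make the two stated properties concrete (variable-size fibers and invariance to fiber orientation), here is a small hand-crafted sketch. It is not the sequence Edge Convolution from the paper: it replaces learned edge features with plain edge lengths and pools them with order-independent statistics, so that a polyline and its reversed copy yield the same descriptor. The function names are hypothetical.

```python
import math


def edge_lengths(fiber):
    """Lengths of the edges joining consecutive 3D points of a polyline."""
    return [math.dist(p, q) for p, q in zip(fiber, fiber[1:])]


def orientation_free_descriptor(fiber):
    """Fixed-size summary of a fiber with any number of points.

    Every statistic is symmetric in the edges, so traversing the fiber
    from either end gives the same result.
    """
    lengths = edge_lengths(fiber)
    return (len(lengths), math.fsum(lengths), max(lengths), min(lengths))


# A toy fiber with four points; real fibers have many more.
fiber = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]

same_both_ways = (
    orientation_free_descriptor(fiber) == orientation_free_descriptor(fiber[::-1])
)
```

A learned model can obtain the same invariance by combining per-edge features with a symmetric pooling step, which is the general idea this sketch stands in for.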

Collision-tolerant Aerial Robots: A Survey

Paolo De Petris , Stephen J. Carlson , Christos Papachristos , Kostas Alexis

Category: Robotics

2022-12-06

As aerial robots are tasked to navigate environments of increased complexity, embedding collision tolerance in their design becomes important. In this survey we review the current state of the art within the niche field of collision-tolerant micro aerial vehicles and present different design approaches identified in the literature, as well as methods that have focused on autonomy functionalities that exploit collision resilience. Subsequently, we discuss the relevance to biological systems and provide our view on key directions for fruitful future research.
